-
Notifications
You must be signed in to change notification settings - Fork 400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(2/3) Add Enum for HTLCHandlingFailed Reasons #3601
base: main
Are you sure you want to change the base?
Conversation
lightning/src/events/mod.rs
Outdated
/// The HTLC was failed by the local node. | ||
Local { | ||
/// The BOLT 04 failure code providing a specific failure reason. | ||
failure_code: u16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit I'm not super excited about exposing the failure code as-is for two reasons - (a) we have very different constraints on the code - we may not always want to give the sender all the details about a failure (eg if the sender used a private channel but we didn't mean to expose it we might pretend the channel doesnt exist) vs what we want to tell the user about a failure (basically everything), (b) its not particularly easy to parse - users have to go read the BOLT then map that to what is actually happening in LDK code, if we had a big enum here we'd be able to provide good docs linking to config knobs and reasons why failures happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, will take a look at adding a more detailed enum or adding to HTLCDestination
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we may not always want to give the sender all the details about a failure (eg if the sender used a private channel but we didn't mean to expose it we might pretend the channel doesnt exist)
Maybe I don't understand this fully, but why would you want to hide information from a local api user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The private channel example is a case where the BOLT 04 code hides information from the sender of the payment, but we do want to surface that information on the API (comment is from a previous approach that just surfaced the BOLT 04 failure code, so wouldn't provide this additional information).
why would you want to hide information from a local api user?
I do think there are some cases where the end user probably doesn't care about very detailed errors. For example, InvalidOnionPayload
- do we really care whether the version is unknown or the onion key is bad when there's nothing that we can do about it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, outdated.
About your second remark - the end user is a user of the library, I assume. Maybe hard to know what they care about. But still it's probably more efficient to wait for someone asking for it than it is to add it potentially unnecessarily. In that context interested to know whether the failure code itself is useful? (posted #3541 (comment))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
About your second remark - the end user is a user of the library, I assume.
Yeah, whoever is consuming LDK's events 👍
But still it's probably more efficient to wait for someone asking for it than it is to add it potentially unnecessarily.
Complaint driven development FTW 😅
In that context interested to know whether the failure code itself is useful?
As discussed on call, this does fall into the nice to have category. I do also think it's also a nice readability improvement to get rid of the failure codes scattered through the codebase, reduces the chances of using the wrong values IMO.
lightning/src/events/mod.rs
Outdated
@@ -1441,6 +1461,10 @@ pub enum Event { | |||
prev_channel_id: ChannelId, | |||
/// Destination of the HTLC that failed to be processed. | |||
failed_next_destination: HTLCDestination, | |||
/// The cause of the processing failure. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its worth noting that HTLCDestination
has some "reason"-like attributes (I meant to mention this in the issue, but apparently forgot), they're just incomplete and too broad. Up to you if you want to try to add more details there vs adding a new enum.
I'm going to be out on vacation until 3 March, but will pick this up when I get back! |
4a14deb
to
26326ea
Compare
Pushing* for a conceptual look because I haven't found a way of doing this that I'm super happy with.
If this looks okay, I'll go ahead and clean up tests+docs and add a similar for * Force pushed because this is a different approach; old version that just surfaces B04 failure codes is here. |
lightning/src/ln/onion_utils.rs
Outdated
/// The raw BOLT04 failure code for the HTLC. | ||
/// | ||
/// See: https://github.com/lightning/bolts/blob/master/04-onion-routing.md#returning-errors. | ||
FailureCode { code: u16 }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm, rather than having a variant that just holds the error code (and exposing it), can we just convert every use of error codes to the enum version and then only write it out as a u16
when we go to serialize the failure? This would also have the nice side-effect of ensuring that we have all the error codes we generate written in one place instead of strewn all over the codebase.
Maybe I'm not quite sure I understand why you needed to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
an we just convert every use of error codes to the enum version and then only write it out as a u16 when we go to serialize the failure
Yeah can def do that. I thought that some of the error codes were overly-specific (so not something that we want to expose to the end user), but if that's okay then it's a much simpler approach.
Played with this a bit locally out of fear of sending you down a dead end... It looks like it's possible to avoid the nested enums and have one big |
26326ea
to
0882ed5
Compare
0882ed5
to
bb45736
Compare
Rebased because there were some non-trivial conflicts, and added tests + docs. In an effort to keep the scope down, decided not to:
Of all of these, the most compelling to change in this PR is |
In terms of the API for the event, it looks like it would be best to refactor This is a pretty significant refactor though so I agree it would be best to split out some preliminary changes into an initial PR. Do you have any instincts for what would be good to start with? Maybe we could start by just returning the big enum from the internal |
Serialization is also a question here, but I think we can do something similar to what we do for the |
Excited for this change! This will help a lot with observability into forwarding failures.
Definitely interested in exposing the cases inside If this change will expose local failures when forwarding HTLCs, is there a plan to expose failures received at the origin node later on as well? |
lightning/src/ln/onion_utils.rs
Outdated
DroppedPending, | ||
/// The HTLC was failed back because its channel is closed and it has timed out on chain. | ||
ChannelClosed, | ||
/// UnknownFaliureCode represents BOLT04 failure codes that we are not familiar with. We will |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// UnknownFaliureCode represents BOLT04 failure codes that we are not familiar with. We will | |
/// UnknownFailureCode represents BOLT04 failure codes that we are not familiar with. We will |
lightning/src/ln/onion_utils.rs
Outdated
/// - We read a deprecated failure code from disk that LDK no longer uses. | ||
/// | ||
/// See <https://github.com/lightning/bolts/blob/master/04-onion-routing.md#returning-errors> | ||
/// for latesst defined error codes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// for latesst defined error codes. | |
/// for latest defined error codes. |
const NODE: u16 = 0x2000; | ||
const UPDATE: u16 = 0x1000; | ||
|
||
/// The reason that a HTLC was failed by the local node. These errors either represent direct, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the local node
Could this potentially be reused later on when exposing the failure reason on the origin side? If so, should the naming be more generic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Outbound payments failure reasons are already reasonably covered byPaymentFailureReason
IMO so going to leave as-is for now, especially because the original sender would only get the (less interesting) subset of variants that map directly to bolt 04 codes.
lightning/src/ln/onion_utils.rs
Outdated
PrivateChannelForward, | ||
/// The HTLC was failed because it made a request to forward over the real channel ID of a | ||
/// channel that implements `option_scid_alias` which is a privacy feature to prevent the | ||
/// real |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// real | |
/// real channel ID from being known. |
I think that the advantage of the current Certainly defer to the opinions of the people who've worked on the API longer, but this approach worked reasonably well with LND/LNDMon.
@wvanlint's suggestion to tackle
Just so I'm clear here, this would mean returning
With the one large
This is a much larger refactor than I was expecting, so I also want to make sure that it's going to be worth the review time for the project (esp since I'm new here)? My main goal was to learn more about the codebase, which I have, so it's already been worth it for me! |
Good points there and also appreciate the link to the LND APIs, seeing the example of what works in prod today was helpful! Will get back in more detail on your comment soon, but FYI I finally took a look at the updates from the last big push and most of the commits here already look great. So would def be happy to start with that, especially since this PR is pretty big so it's nice to split it up anyway. How does it sound to start with landing most of the commits prior to the last one that actually surfaces anything in the public API? |
Opinions on API solidified a bit for me after offline discussion, I think that deprecating
We'd be able to map these two fields to
sgtm! Perhaps also adding a refactor of |
API looks great! Thanks a lot for the patience on this.
Definitely think that could make sense! |
Just to be 100% sure we're on the same page -- I chatted with @TheBlueMatt yesterday and he emphasized that (1) the way the current PR code replaces the BOLT 4 raw error codes with the enum is already a solid improvement to readability, and (2) it's really nice that we only have one enum used for both the event and for onion error codes in LN messages. Mainly pointing this out in case there was any risk of rewriting some of the commits here, since they seem to align with these goals as-is. |
This seems fine to me. The reason |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
IIUC explicit discriminants require a unique value per variant, so they won't work here because we're mapping many variants to one error code. |
bb45736
to
ad60f71
Compare
ad60f71
to
4c5b4b3
Compare
pub fn queue_add_htlc<F: Deref, L: Deref>( | ||
&mut self, amount_msat: u64, payment_hash: PaymentHash, cltv_expiry: u32, source: HTLCSource, | ||
onion_routing_packet: msgs::OnionPacket, skimmed_fee_msat: Option<u64>, | ||
blinding_point: Option<PublicKey>, fee_estimator: &LowerBoundedFeeEstimator<F>, logger: &L | ||
) -> Result<(), ChannelError> | ||
) -> Result<(), (LocalHTLCFailureReason, String)> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it makes sense to create a specific error for this tuple. I'm not familiar enough with this part of the code base to say that we can add another kind of ChannelError
, but something on these lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basical LGTM
IMO it's a good time to rebase on main this since the first PR landed, this should follow pretty quickly and it looks like the conflicts have accumulated a bit |
4c5b4b3
to
f3010bb
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3601 +/- ##
==========================================
- Coverage 89.05% 89.03% -0.03%
==========================================
Files 155 155
Lines 122019 122130 +111
Branches 122019 122130 +111
==========================================
+ Hits 108666 108740 +74
- Misses 10695 10725 +30
- Partials 2658 2665 +7 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No major feedback, looking good here!
lightning/src/ln/onion_utils.rs
Outdated
/// The HTLC was pending on a channel which is now in the process of being closed. | ||
/// It was not fully committed to, so can just be immediately failed back. | ||
DroppedPending, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we'll return this in can_accept_incoming_htlc
, and the docs on that method state that the HTLC has already been irrevocably committed to when the method is called, which seems to contradict this a bit.
I'm also wondering if this variant can be named ChannelClosed
instead? It seems like the channel closing is the real reason that DroppedPending
HTLCs failed. The current ChannelClosed
variant seems more like OnchainTimeout
, but I'm still looking through the code on this one to confirm.
lightning/src/ln/onion_utils.rs
Outdated
/// The HTLC was failed because its amount is less than the smallest HTLC that the channel | ||
/// can currently accept. | ||
LiquidityMinimum, | ||
/// The HTLC was failed because its amount is more than then largest HTLC that the channel | ||
/// can currently accept. | ||
LiquidityMaximum, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: s/Liquidity/HTLC
on these might be clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally decided agianst this because I think HTLC(min/max) sounds like we've hit channel policy limits on min/max, when get_available_balances
actually incorporate all sorts of things (dust, policy, liquidity etc).
Going with the rename + adding more info to the docs instead 👍
@@ -8821,8 +8813,8 @@ impl<SP: Deref> FundedChannel<SP> where | |||
{ | |||
let send_res = self.send_htlc(amount_msat, payment_hash, cltv_expiry, source, | |||
onion_routing_packet, false, skimmed_fee_msat, None, fee_estimator, logger); | |||
if let Err(e) = &send_res { if let ChannelError::Ignore(_) = e {} else { debug_assert!(false, "Sending cannot trigger channel failure"); } } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to get rid of these unreachable code paths!
lightning/src/ln/onion_utils.rs
Outdated
@@ -1777,8 +1820,19 @@ impl_writeable_tlv_based_enum!(HTLCFailReasonRepr, | |||
(_unused, err, (static_value, msgs::OnionErrorPacket { data: data.ok_or(DecodeError::InvalidValue)?, attribution_data })), | |||
}, | |||
(1, Reason) => { | |||
(0, failure_code, required), | |||
(0, _failure_code, (legacy, u16, | |||
|r: &HTLCFailReasonRepr| Some(r.clone()) )), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is off because we need to write a u16
here? Unfortunately our ser macros currently don't enforce that the type being written matches the legacy type :(
lightning/src/ln/onion_utils.rs
Outdated
if let Some(code) = _failure_code { | ||
let failure_reason: LocalHTLCFailureReason = code.into(); | ||
RequiredWrapper::from(failure_reason) | ||
} else { | ||
reason | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for safety we should return the unknown failure reason here even though (as you note above) the failure_code should always be set in this situation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed this to return InvalidValue
if reason
is None
and we don't have a _failure_code
, similar to the way it's been done for attribution_data
above.
Open to other approaches here too! Still getting comfortable w/ these macros.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks! Definitely a bit of a learning curve to these macros at times
f3010bb
to
3a74be6
Compare
Diff for last push here, renamed a few enums as requested + fixed up serialization of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just minor tweaks, will start looking for a 2nd reviewer
lightning/src/ln/onion_payment.rs
Outdated
err_code: 0x4000|22, | ||
reason: LocalHTLCFailureReason::InvalidKeysendPreimage, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is an existing bug in the codebase: https://github.com/lightning/blips/blob/master/blip-0003.md and we should be erroring with PERM|15
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled out the fix for this in 11a4f3e then updated in the refactor 👍
lightning/src/ln/channelmanager.rs
Outdated
fn construct_pending_htlc_fail_msg<'a>(&self, msg: &msgs::UpdateAddHTLC, | ||
counterparty_node_id: &PublicKey, shared_secret: [u8; 32], inbound_err: InboundHTLCErr) -> HTLCFailureMsg { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we're trying to adopt rustfmt formatting where possible ahead of actually using it on this file, which would format as such:
fn construct_pending_htlc_fail_msg<'a>(
&self, msg: &msgs::UpdateAddHTLC, counterparty_node_id: &PublicKey,
shared_secret: [u8; 32], inbound_err: InboundHTLCErr,
) -> HTLCFailureMsg {
...
}
lightning/src/ln/channelmanager.rs
Outdated
let htlc_destination = get_failed_htlc_destination(outgoing_scid_opt, update_add_htlc.payment_hash); | ||
htlc_fails.push((htlc_fail, htlc_destination)); | ||
htlc_fails.push((self.construct_pending_htlc_fail_msg(&update_add_htlc, &incoming_counterparty_node_id, shared_secret, inbound_err), htlc_destination)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: separate this new call into an intermediate var?
&self, msg: &msgs::UpdateAddHTLC, counterparty_node_id: &PublicKey, shared_secret: [u8; 32], | ||
decoded_hop: onion_utils::Hop, allow_underpay: bool, | ||
next_packet_pubkey_opt: Option<Result<PublicKey, secp256k1::Error>>, | ||
) -> PendingHTLCStatus { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like another benefit of this refactor is that we can more easily remove the PendingHTLCStatus
struct once its legacy usage in channel.rs
is removed
3a74be6
to
147072f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think - but not sure - that I would have preferred a PR that is just a mechanical substitution of all the u16s for enum values, and inside that a clear separation between code that is only literal search/replace and the rest. Maybe I am overcautious with this change set though.
Then talk about the rest in one or more follow ups.
err_code: 0x4000|22, | ||
err_data: Vec::new(), | ||
err_code: 0x4000 | 15, | ||
err_data: invalid_payment_err_data(amt_msat, current_height), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this commit is small, still separating the refactor from the bug fix is beneficial I think.
@@ -36,6 +36,14 @@ pub struct InboundHTLCErr { | |||
pub msg: &'static str, | |||
} | |||
|
|||
/// Writes payment data for invalid or unknown payment error code. | |||
pub (super) fn invalid_payment_err_data(amt_msat: u64, current_height: u32) -> Vec<u8>{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should return a (code, data) tuple, to prevent bugs where the data is used with a diff code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Returning the code isn't helpful in the commit that follow this because we start to introduce variants like InvalidKeysendPreimage
so this function can't always return IncorrectPaymentDetails
.
@@ -5919,7 +5923,7 @@ where | |||
}) => { | |||
let cltv_expiry = routing.incoming_cltv_expiry(); | |||
macro_rules! failure_handler { | |||
($msg: expr, $err_code: expr, $err_data: expr, $phantom_ss: expr, $next_hop_unknown: expr) => { | |||
($msg: expr, $reason: expr, $err_data: expr, $phantom_ss: expr, $next_hop_unknown: expr) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could isolate the rename. This commit already needs a lot of attention to scan for accidental replacement errors.
lightning/src/ln/onion_utils.rs
Outdated
|
||
impl Into<LocalHTLCFailureReason> for u16 { | ||
fn into(self) -> LocalHTLCFailureReason { | ||
if self == (NODE | 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to think of a map here for readability, but no idea of the implications of that and whether it is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried something like this here. The downside of a map is that you can have a runtime failure to find the code you're looking for if you forget to add something to the map; I felt that the compiler check that match
has all the variants was more compelling than the readability improvement.
Explicit discriminants were also suggested, but they require a unique value per variant.
Could also have this be something like this, which has the benefit of only defining the values once:
if self == LocalHTLCFailureReason::TemporaryNodeFailure.failure_code() {
LocalHTLCFailureReason::TemporaryNodeFailure
}
Looks a little ungodly to me, but happy to consider something different here (I don't love it either).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be:
macro_rules! impl_into_u16_for_local_failure {
($($variant:path),+ $(,)?) => {
impl Into<LocalHTLCFailureReason> for u16 {
fn into(self) -> LocalHTLCFailureReason {
$(
if self == $variant.failure_code() {
return $variant;
}
)+
LocalHTLCFailureReason::UnknownFailureCode { code: self }
}
}
};
}
impl_into_u16_for_local_failure!(
LocalHTLCFailureReason::TemporaryNodeFailure,
LocalHTLCFailureReason::PermanentNodeFailure,
... all bolt 04 variants
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily advocating this, but you could use the macro to define the enum as well. This would ensure each one is used, whereas otherwise impl_into_u16_for_local_failure
needs to be updated whenever a new variant is added.
For some creative uses of macros see:
- https://github.com/lightningdevkit/rust-lightning/blob/main/lightning-types/src/features.rs
- https://github.com/lightningdevkit/rust-lightning/blob/main/lightning-custom-message/src/lib.rs
rust-lightning/lightning/src/util/ser_macros.rs
Line 1050 in 83e9e80
macro_rules! tlv_stream {
For instance, you could use a pared down version of composite_custom_message_handler
's rule to specify each variant as something like:
macro_rules! failure_reason {
($($variant:ident($code:expr)),* $(,)*) => {
// ...
}
}
failure_reason!(
TemporaryNodeFailure(UPDATE | 7),
// ...
)
Adding docs may not be as nice. We do something for features like this:
#[doc = $doc] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if you all start threatening with macros, I'll back off 😅
/// We failed to decode the onion payload. | ||
/// | ||
/// If the payload we failed to decode belonged to a Trampoline onion, following the successful | ||
/// decoding of the outer onion, the trampoline_shared_secret field should be set. | ||
Relay { | ||
err_msg: &'static str, | ||
err_code: u16, | ||
reason: LocalHTLCFailureReason, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another rename to isolate potentially
@@ -20,7 +20,7 @@ use crate::ln::msgs::{ | |||
BaseMessageHandler, ChannelMessageHandler, MessageSendEvent, OnionMessageHandler, | |||
}; | |||
use crate::ln::offers_tests; | |||
use crate::ln::onion_utils::INVALID_ONION_BLINDING; | |||
use crate::ln::onion_utils::LocalHTLCFailureReason; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This commit is just so intense. Getting nervous of reviewing so many trivial changes. I wonder if it is worth to:
- Do the renames from err_code to reason separately in a non-review commit
- Introduce the LocalHTLCFailureReason enum, still unused
- Have a commit that contains just mechanical search/replaces that needs no review. Possibly list all the replaces that were applied.
- Then finish with a commit with the semi-manual leftovers.
Maybe it's too much of an ask though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll take a shot at splitting this up 👍
@@ -1712,6 +1712,49 @@ impl Into<LocalHTLCFailureReason> for u16 { | |||
} | |||
} | |||
|
|||
impl_writeable_tlv_based_enum!(LocalHTLCFailureReason, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps some explanation in the commit message or code why we don't just keep writing a u16 code?
lightning/src/ln/onion_utils.rs
Outdated
| Self::DustLimitHolder | ||
| Self::DustLimitCounterparty | ||
| Self::FeeSpikeBuffer | ||
| Self::ChannelNotReady => UPDATE | 7, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a fundamental design choice. To allow multiple reasons to map to the same bolt code. Somehow it suggests a layering violation, but perhaps this is just the practical way to do it.
It does make this commit more than the basic 'replace u16 with enum'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a fundamental design choice. To allow multiple reasons to map to the same bolt code.
Indeed, this is one variant because it allows us to surface this enum directly on the API. Without that, we'd need some sort of wrapper enum which allows either a direct BOLT 04 mapping or an "enriched" failure that then maps to the BOLT04 mapping.
enum Failure {
Bolt04(Bolt04Failure)
Enriched(LocalFailure)
}
Since we want to surface additional information about the errors, we have to accept that this many to one mapping has to happen somewhere - it either happens in one enum, or across two.
The above Failure
structure also doesn't really make sense to surface to the end user, so we'd then need to map this further to a flat enum, which adds a lot of boiler plate.
Somehow it suggests a layering violation, but perhaps this is just the practical way to do it.
While it does feel a bit like a layering violation because these values end up on the API, it is also just a more accurate way of representing internal error states. The BOLT04 error codes erase a lot of information that's internally available, so perhaps the actual layering violation is to use wire-level error codes internally? For example, this change also gives us more precise unit testing because we can now assert that the TemporaryChannelFailure
we're receiving is actually the one we're expecting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The BOLT04 error codes erase a lot of information that's internally available, so perhaps the actual layering violation is to use wire-level error codes internally?
I love this inversion. Sold.
lightning/src/ln/onion_utils.rs
Outdated
| LocalHTLCFailureReason::DustLimitHolder | ||
| LocalHTLCFailureReason::DustLimitCounterparty | ||
| LocalHTLCFailureReason::FeeSpikeBuffer | ||
| LocalHTLCFailureReason::ChannelNotReady => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again a bit of that layering friction, at least that is how I perceive it.
lightning/src/ln/onion_utils.rs
Outdated
} | ||
} | ||
|
||
impl Into<LocalHTLCFailureReason> for u16 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer From
over Into
as per the Into
docs: https://doc.rust-lang.org/std/convert/trait.Into.html
lightning/src/ln/onion_utils.rs
Outdated
|
||
impl Into<LocalHTLCFailureReason> for u16 { | ||
fn into(self) -> LocalHTLCFailureReason { | ||
if self == (NODE | 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily advocating this, but you could use the macro to define the enum as well. This would ensure each one is used, whereas otherwise impl_into_u16_for_local_failure
needs to be updated whenever a new variant is added.
For some creative uses of macros see:
- https://github.com/lightningdevkit/rust-lightning/blob/main/lightning-types/src/features.rs
- https://github.com/lightningdevkit/rust-lightning/blob/main/lightning-custom-message/src/lib.rs
rust-lightning/lightning/src/util/ser_macros.rs
Line 1050 in 83e9e80
macro_rules! tlv_stream {
For instance, you could use a pared down version of composite_custom_message_handler
's rule to specify each variant as something like:
macro_rules! failure_reason {
($($variant:ident($code:expr)),* $(,)*) => {
// ...
}
}
failure_reason!(
TemporaryNodeFailure(UPDATE | 7),
// ...
)
Adding docs may not be as nice. We do something for features like this:
#[doc = $doc] |
Introduce an enum representation of BOLT04 error codes that will be introduced in the commits that follow, along with being extended to represent failures that are otherwise erased by BOLT04 error codes (insufficient liquidity which is erased by temporary channel failure, for example).
Now that we're passing around more than a BOLT04 error code, rename the field accordingly.
Now that we're not passing BOLT04 error codes around, rename the field accordingly.
Now that we're not passing around error codes, update variable names accordingly.
Update the failure reason that's persisted in HTLCFailReasonRepr to store our newly added enum, rather than the old u16. In the commits that follow, we'll add more variants create a many-to-one mapping from enum to BOLT04 code (as we're surfacing internal information that is intentionally erased by the BOLT04 codes). Without this change to persistence, we'll lose the detail that these variants contain by storing them as their erased BOLT04 code.
Sometimes the error codes that we return to the sender intentionally obscure information about the payment to prevent probing/ leaking information. This commit updates our internal representation to surface some of these failures, which will be converted to their corresponding bolt 04 codes when they're sent over the wire.
To be able to obtain the underlying error reason for the pending HTLC, break up the helper method into two parts. This also removes some unnecessary wrapping/unwrapping of messages in PendingHTLCStatus types.
147072f
to
125fb63
Compare
Broke the large commit up into smaller ones, no changes in logic except for implementing |
This PR adds an enum to represent all possible BOLT 04 error codes, along with some variants that surface additional information that will be useful to the end user.
Depends on #3699
API changes in #3700